Parallelization  of  the  DG  method  for  the  Navier- 
Stokes  Equations 

Summary 

Accurate  predictions  of  skin  friction  and  thermal  loads  of  high  speed  complex 
flows,  in  both  simple  and  nontrivial  geometries,  require  good  shock  capturing 
capability  and  high  accuracy  in  the  viscous  flow  region.  High  order  discontin¬ 
uous  Galerkin  (DG)  discretizations  possess  features  that  make  them  attrac¬ 
tive  for  accurate  computation  of  complex  flows  with  strong  shocks.  A  key 
ingredient  that  would  make  the  DG  method  more  attractive  for  these  com¬ 
putations,  is  application  of  p-adaptive  procedures  that  ensure  accurate  cap¬ 
turing  of  discontinuities  with  low  order  expansions  and  resolution  of  smooth 
complex  features,  such  as  vortices  and  shear  layers,  with  higher  order  ac¬ 
curacy.  A  limiting  procedure  of  DG  discretizations  capable  of  commuting 
high  speed  flows  with  strong  shocks  around  complex  geometries,  using  a  p- 
adaptive  procedure  on  mixed  type  (quadrilateral  and  triangular)  meshes  was 
developed  and  preliminary  results  were  presented  in  the  previous  report.  In 
this  report  further  validation  of  the  limiting  approach  is  presented  for  stan¬ 
dard  computationally  demanding  flow  cases,  such  as  the  Mach  reflection  and 
the  wind  tunnel  with  a  step.  Application  examples  for  both  quadrilateral, 
triangular  and  mixed  type  meshes  are  shown. 

These  large  scale  computations  were  made  possible  by  parallelizing  the 
code  using  domain  decomposition  and  MPI.  Viscous  terms  were  added  for  the 
DG  method  using  the  local  DG  approach.  Incorporation  of  an  implicit  time 
marching  scheme,  which  will  make  possible  high  Reynolds  number  compu¬ 
tations,  is  underway  for  a  single  processor.  Implementation  of  implicit  time 
marching  with  domain  decomposition  is  more  involved  and  requires  use  of  a 
dual  time  stepping  for  time  accurate  computations.  Developments  of  capabil¬ 
ities  to  compute  viscous  flow  numerical  solutions  with  implicit  time  stepping 
for  multiprocessor  will  be  presented  in  the  next  report. 
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Parallelization 


The  compact  form  of  the  DG  discretization  stencil  (information  is  needed 
only  from  the  immediate  neighbors)  permits  efficient  implementation  on  par¬ 
allel  processors.  The  data  structure  of  the  code  was  designed  in  order  to 
meet  the  requirements  of  the  DG  method  on  mixed  type  element  meshes  and 
p-adaptive  expansions.  Moreover,  it  allows  application  of  the  domain  decom¬ 
position  method  through  MPI  without  serious  additional  coding  complexity. 

Every  simulation  with  domain  decompositions  initiates  with  the  par¬ 
titioning  of  the  computational  domain  using  free  available  software  Metis 
(http:/ /glaros. dtc.umn.edu/gkhome/views/metis).  In  order  to  handle  mixed 
type  meshes,  a  graph  file  representing  the  mesh  is  initially  created  and  subse¬ 
quently  partitioning  of  the  graph  in  to  subdomains  leads  to  the  final  decom¬ 
position  of  the  mesh.  Every  partition  (subdomain)  includes  layers  of  elements 
belonging  to  neighboring  partitions.  This  set  of  elements  is  the  receiving  list 
for  a  partition,  and  the  set  of  elements  of  the  partition  sharing  the  same 
edges  with  the  receiving  list,  is  the  sending  list  to  neighboring  partitions. 

A  TVD  Runge-Kutta  method  was  used  for  time  marching.  Limiting  is 
applied  after  every  stage  of  the  RK  cycle.  Finally,  an  exchange  between 
the  partitions  of  the  solution  is  also  performed  at  the  end  of  each  stage. 
For  a  parallel  p-adaptive  DG  code,  all  coefficients  of  the  expansion  for  the 
highest  order  polynomial  approximation  are  exchanged  between  partitions  of 
the  mesh. 

The  parallel  algorithm  for  every  partition  is  as  follows: 

1 .  Transfer  the  solution  (expansion  coefficients)  at  the  end  of  the  RK  cycle 
between  partitions.  Wait  for  the  transfer  to  complete. 

2.  Apply  the  limiter. 

3.  Transfer  of  all  coefficients  of  the  highest  polynomial  application.  Wait 
for  the  transfer  to  complete. 

4.  Assign  solution  coefficients  from  PO  up  to  the  highest  polynomial  ap¬ 
plication. 

5.  Compute  the  volume  integrals. 

6.  Compute  the  line  integrals. 

7.  Compute  residual  for  every  element  belonging  to  the  partition. 

8.  Enter  the  next  RK  stage. 

9.  Complete  the  RK  cycle  and  repeat. 
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Results 


Results  from  parallel  computations  of  flows  with  strong  shocks  on  single-  and 
mixed- type  meshes  are  presented  next.  The  supersonic  flow  over  a  cylinder 
is  the  first  example.  The  single  type  computational  mesh  with  triangular 
elements  and  the  computed  density  field  for  flow  at  M=2.0  is  are  shown  in 
Figs.  1-2.  The  same  flow  was  computed  on  a  mesh  with  arbitrary  quadri¬ 
lateral  elements  (see  Figs.  3  and  4).  This  computation  was  not  possible 
with  the  original  limiting  procedure  suggested  by  Cockburn  and  Shu  [1]  and 
difficult  to  implement  with  limiting  approaches  presented  in  Ref.  2-6,  but 
it  is  straightforward  with  our  new  limiting  approach.  Next,  an  example  of 
a  computation  on  a  mixed-type  mesh  is  presented.  The  mesh  and  the  com¬ 
puted  density  field  is  shown  in  Fig.  5.  For  this  computation,  a  p-adaptive 
procedure  is  applied  where  the  flow  near  the  discontinuity  is  computed  with 
PI  polynomials  as  flagged  by  the  limiter  while  the  rest  of  the  smooth  flow 
field  is  computed  with  P2  polynomials.  A  comparison  of  the  computed  den¬ 
sity  along  the  symmetry  line  is  shown  in  Fig.  6.  The  agreement  with  the 
quadrilateral  mesh  solution  is  excellent. 

Computations  for  two  classical  examples  are  presented  next.  These  are 
the  M=3.0  tunnel  with  a  step  and  the  double  Mach  reflection  of  a  strong 
shock  at  Mach=10  (P.  Woodward  and  P.  Colella,  ’’The  numerical  simulation 
of  two-dimensional  fluid  flow  with  strong  shocks,”  Journal  of  Computational 
Physics,  Vol.  54,  No.  1,  1984,  pp.  115-173).  The  computational  domain  for 
the  tunnel  with  a  step  discretized  with  a  mixed  type  mesh  and  the  partition 
of  this  mesh  to  subdomains  for  parallel  processing  is  shown  in  Fig.  7.  At 
the  corner  the  mesh  was  refined  in  order  to  diminish  the  Mach  stem  that  is 
created  from  the  artificial  entropy  layers  caused  by  the  sharp  corner.  The 
computed  flow  field  and  the  elements  flagged  for  limiting  are  shown  in  Fig. 
8.  The  elements  flagged  for  limiting  for  the  double  Mach  reflection  problem 
computation  are  shown  in  Fig.  9  and  the  computed  flow  field  is  shown 
in  Fig.  10.  For  these  computations  the  elements  flagged  for  limiting  are 
updated  continuously  during  the  time  accurate  computation.  Computations 
with  higher  order  expansions  using  p-adaptive  procedures  are  underway  and 
they  will  presented  at  the  AIAA  ASM  meeting  in  January  2011. 


Conclusions  and  Outlook 

The  DG  code  was  parallelized  using  domain  decomposition  and  MPI.  The 
new  limiting  procedure  for  DG  discretizations  was  then  applied  for  standard 
computationally  demanding  test  problems.  The  proposed  approach  applied 


3 


to  quadrilateral,  triangular,  and  mixed-type  meshes.  The  extension  of  the 
limiting  approach  in  three  dimensions  is  underway.  Numerical  solutions  of 
viscous  flows  with  shocks  will  be  computed  using  a  p- adaptive  procedure. 
For  high  Reynolds  number  flows  and  long  time  integration  with  small  size 
meshes  use  of  an  implicit  time  marching  scheme  is  necessary. 
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Figure  1.  Triangular  element  mesh  over  the  cylinder 


Figure  2.  Computed  density  field  at  M=2. 
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Figure  3.  Quadrilateral  element  mesh  over  the  cylinder 


Figure  4.  Computed  density  field  at  M=2. 
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Density 


A 


Figure  5.  Mixed-type  mesh  for  the  computation  of  supersonic  flow. 


Figure  6  Comparison  of  the  density  along  the  symmetry  stagnation  line 
obtained  with  quadrilateral  (PI)  and  mixed  type  meshes  (p2  adaptive). 
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(elements  flagged  for  limiting) 


Figure  8.  Elements  flagged  for  limiting  (top)  and  computed  density  and 
pressure  for  the  flow  in  a  tunnel  with  a  step. 
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Figure  9.  Elements  flagged  for  limiting 


Figure  11.  Computed  density  field. 


